Safety
Overview
The safety policy can be used to detect user-provided inputs or model responses that contain unsafe material, such as obscene, toxic, or harmful content.
Input Safety Policy Definition
Dynamo AI uses the following policy definition for detecting unsafe content.
- Description: Safeguard chat bot from queries with unsafe content, context, or intent.
- Disallowed:
- Illegal or criminal activities
- Violent, harassment, or hurtful topics
- Hateful, derogatory, or discriminative speech
- Dangerous, unethical, or harmful topics
- Encouragement or provision of methods of self-harm or suicide
- Abusive, biased, or unfair remarks
- Sexual references including but not limited to sexual innuendos, acts, devices, and favors
- Sexist or racist discourse including implicit and explicit discrimination or stereotyping
- Untruthful, manipulative, or misleading statements
- Creation, acquisition, or operation of weapons
- Procurement or use of harmful substances
- Vulgar or offensive language
- Child harm topics
- Malware topics
- Privacy violation topics
- Allowed:
- General harmless queries
- Requests for responsible information on violence and discrimination
- Requests for responsible sexual education, health, or consent
- Requests for factual resources for mental health
- Queries on resources for managing conflicts and reporting harassment
- Queries on promoting diversity, fairness, and inclusion
- Queries on responsible, harmless, and safe information on substances
- Requests for explanations of ethical and responsible behavior
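For illustration, the input policy definition above can be expressed as a structured configuration. The schema below (field names and shape) is a hypothetical sketch, not Dynamo AI's actual policy format:

```python
# Hypothetical structured form of the input safety policy above.
# The "description" / "disallowed" / "allowed" schema is illustrative only.
input_safety_policy = {
    "description": (
        "Safeguard chat bot from queries with unsafe content, "
        "context, or intent."
    ),
    "disallowed": [
        "Illegal or criminal activities",
        "Violent, harassment, or hurtful topics",
        "Hateful, derogatory, or discriminative speech",
        "Dangerous, unethical, or harmful topics",
        "Encouragement or provision of methods of self-harm or suicide",
        "Abusive, biased, or unfair remarks",
        "Sexual references including but not limited to sexual "
        "innuendos, acts, devices, and favors",
        "Sexist or racist discourse including implicit and explicit "
        "discrimination or stereotyping",
        "Untruthful, manipulative, or misleading statements",
        "Creation, acquisition, or operation of weapons",
        "Procurement or use of harmful substances",
        "Vulgar or offensive language",
        "Child harm topics",
        "Malware topics",
        "Privacy violation topics",
    ],
    "allowed": [
        "General harmless queries",
        "Requests for responsible information on violence and discrimination",
        "Requests for responsible sexual education, health, or consent",
        "Requests for factual resources for mental health",
        "Queries on resources for managing conflicts and reporting harassment",
        "Queries on promoting diversity, fairness, and inclusion",
        "Queries on responsible, harmless, and safe information on substances",
        "Requests for explanations of ethical and responsible behavior",
    ],
}
```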
Output Safety Policy Definition
Coming soon.
Safety Policy Actions
You can control what happens to inputs and outputs when the safety policy is applied using the actions below:
- Flag: flag unsafe content for moderator review while still allowing it through
- Block: block user inputs or model outputs containing unsafe content so they are not delivered
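The difference between the two actions can be sketched as follows. The function and result type below are hypothetical illustrations of the flag-versus-block behavior, not Dynamo AI's API:

```python
from dataclasses import dataclass

# Hypothetical sketch of applying the Flag and Block actions.
# Names and return shape are illustrative, not Dynamo AI's actual API.
@dataclass
class ModerationResult:
    unsafe: bool                # did the safety policy match?
    action: str                 # configured action: "flag" or "block"
    delivered: bool             # does the content reach its destination?
    flagged_for_review: bool    # is it queued for a moderator?

def apply_safety_action(content_is_unsafe: bool, action: str) -> ModerationResult:
    if not content_is_unsafe:
        # Safe content passes through untouched regardless of the action.
        return ModerationResult(False, action, delivered=True, flagged_for_review=False)
    if action == "flag":
        # Flag: content still flows through, but is queued for moderator review.
        return ModerationResult(True, action, delivered=True, flagged_for_review=True)
    if action == "block":
        # Block: content is withheld entirely.
        return ModerationResult(True, action, delivered=False, flagged_for_review=False)
    raise ValueError(f"unknown action: {action}")
```

Under this sketch, Flag is a non-intrusive audit path, while Block is an enforcement path that stops unsafe content from being delivered at all.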